test: Implement MPC cluster orchestration for Rust E2E tests #2589

Merged
anodar merged 7 commits into main from 2441-implement-mpc-cluster-orchestration-for-rust-e2e-tests-infra on Mar 27, 2026

Collaborator

@anodar anodar commented Mar 24, 2026

This is the first of two stacked changes -- this one adds the cluster logic with unimplemented!() stubs for the NEAR blockchain (sandbox & client) that are implemented in the followup change.

Notable changes:

  • Replaced near-workspaces with near-kit, as per Ricky's recommendation (near-workspaces is being deprecated)
  • Dropped near_node and added near_sandbox (implemented in the followup), as that's what we'll be using to run neard
  • MpcNodeSetup no longer takes near_node; instead it takes (chain_id, near_genesis_path, near_boot_nodes) arguments

Closes #2441

Introduce the core MPC cluster orchestration infrastructure that replaces
Python's MpcCluster. This is the first of two stacked changes -- this one
adds the cluster logic with todo!() stubs for the NEAR blockchain and
sandbox components (filled in by the next change).

Components:
- cluster.rs: MpcCluster orchestrator with full setup sequence (sandbox,
  contract deploy, account creation, attestations, domain addition, node
  spawning), node lifecycle (kill/start/restart), contract operations
  (state queries, add_domains, resharing), metrics polling, and data
  management (wipe_db, block ingestion control).
- mpc_node.rs: Updated MpcNode/MpcNodeSetup with metrics scraping (HTTP
  /metrics), wait_for_metric, set_block_ingestion, reserve_key_event_attempt,
  migration_state, wipe_db. Decoupled from NearNode -- takes genesis_path,
  boot_nodes, chain_id directly.
- sandbox.rs: NearSandbox stub (todo!() -- Docker implementation in next change)
- blockchain.rs: NearBlockchain + DeployedContract stubs (todo!() -- near-kit
  implementation in next change)
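
To illustrate the metrics polling mentioned for mpc_node.rs, here is a minimal std-only sketch of a wait-until-value helper. The signature, the `read` closure, and the error type are assumptions made for illustration; the actual `wait_for_metric` in this change scrapes the node's HTTP /metrics endpoint and may look quite different.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `read` until it returns `Some(expected)` or `timeout` elapses.
/// Hypothetical signature: `read` stands in for an HTTP /metrics scrape.
fn wait_for_metric<F>(
    mut read: F,
    expected: i64,
    timeout: Duration,
    interval: Duration,
) -> Result<(), String>
where
    F: FnMut() -> Option<i64>,
{
    let deadline = Instant::now() + timeout;
    loop {
        let last = read();
        if last == Some(expected) {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err(format!(
                "timed out waiting for metric == {expected}, last value: {last:?}"
            ));
        }
        // Back off before the next scrape so the node is not hammered.
        sleep(interval);
    }
}
```

Reporting the last observed value in the timeout error tends to make failed E2E runs easier to diagnose than a bare "timed out".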

Closes #2441
@claude
Copy link
Copy Markdown

claude bot commented Mar 24, 2026

Code Review

Reviewed the full diff: new test infrastructure for MPC cluster orchestration (blockchain, cluster, mpc_node, near_sandbox modules), replacing near-workspaces/near-sandbox with near-kit and Docker sandbox stubs.

No critical issues found. The code is well-structured test scaffolding with clean module boundaries. The unimplemented!() stubs are expected given this is Change 1 of a stacked PR.

Minor observations (non-blocking):

  • create_node_accounts creates accounts sequentially — could be parallelized for faster test startup in the followup, but fine for now.
  • get_metric uses v as i64 truncation from f64, which is fine for test metrics but worth noting if metrics ever exceed i64::MAX.
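
For context on the second observation: in Rust, an `as` cast from `f64` to `i64` drops the fractional part, saturates at the integer bounds, and maps NaN to 0, so an oversized metric would silently become `i64::MAX` rather than panic. The sketch below shows a checked alternative; `metric_to_i64` is a hypothetical helper, not part of this change.

```rust
/// Converts a metric value to i64, returning None instead of silently
/// saturating. Hypothetical helper for illustration only.
fn metric_to_i64(v: f64) -> Option<i64> {
    // `i64::MAX as f64` rounds up to 2^63, so the upper bound is exclusive.
    if v.is_finite() && v >= i64::MIN as f64 && v < i64::MAX as f64 {
        Some(v as i64) // truncates toward zero
    } else {
        None
    }
}
```

For test metrics the plain cast is likely sufficient, as the review notes; the checked form only matters if a saturated value could be mistaken for a real reading.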

✅ Approved

Collaborator

@netrome netrome left a comment

Overall looks very good, but I'm not a fan of all the unimplemented!() code paths, since they mean we can't start the MpcCluster. Feels like we should have started with those components instead and then wired them up in this PR. But anyway. Not a hard blocker.

This also means MpcCluster::start is completely dead code right now, and there are no tests calling it.

Comment on lines +191 to +199
let state = self.nodes.remove(idx);
let new_state = match state {
    MpcNodeState::Running(node) => MpcNodeState::Stopped(node.kill()),
    MpcNodeState::Stopped(setup) => {
        tracing::warn!(node = idx, "node already stopped");
        MpcNodeState::Stopped(setup)
    }
};
self.nodes.insert(idx, new_state);
Collaborator

Hmm not super happy about this but I don't have any better suggestion right now. I was surprised to learn there's no good non-panicking remove method.

If this were production code I'd advocate for an index check to prevent the potential panic when out-of-bounds indexes are provided, but for test code I think it's fine.

Collaborator Author

Actually, this wouldn't need to return Result<()> if we didn't check the bound, so I added that check.

Collaborator

@netrome netrome Mar 26, 2026

Nice, thanks!
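
The bounds check discussed in this thread might look roughly like the sketch below. It is std-only and uses stand-in types: `NodeState`, its `String` payloads, and the `String` error are assumptions for illustration, whereas the real code uses `MpcNodeState`, `node.kill()`, and its own error type.

```rust
/// Stand-in for MpcNodeState; the String payload replaces the real
/// node/setup values purely for illustration.
#[derive(Debug, PartialEq)]
enum NodeState {
    Running(String),
    Stopped(String),
}

/// Moves the node at `idx` to the Stopped state, returning an error
/// instead of panicking when `idx` is out of bounds.
fn kill_node(nodes: &mut Vec<NodeState>, idx: usize) -> Result<(), String> {
    if idx >= nodes.len() {
        return Err(format!(
            "node index {idx} out of bounds (cluster has {} nodes)",
            nodes.len()
        ));
    }
    // Safe after the check above: `Vec::remove` panics only when idx >= len.
    let new_state = match nodes.remove(idx) {
        NodeState::Running(name) => NodeState::Stopped(name),
        already_stopped @ NodeState::Stopped(_) => already_stopped,
    };
    nodes.insert(idx, new_state);
    Ok(())
}
```

The remove-then-insert pair keeps the node at the same index, which is what lets the state transition take ownership of the old value without a placeholder variant.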

Comment on lines +373 to +385
let futures = clients.iter().map(|(i, account, client)| {
    let args = args.clone();
    let method = method.to_string();
    async move {
        self.contract
            .call_from(client, &method, args)
            .await
            .with_context(|| format!("node {i} ({account}) failed to call {method}"))
    }
});

futures::future::try_join_all(futures).await?;
Ok(())
Collaborator

Nice concurrency!
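
The fan-out pattern in the snippet above (issue one call per node, attach per-node context to any error, fail if any call fails) can be sketched without an async runtime. The version below is a synchronous stand-in built on `std::thread::scope`, not the `futures`-based code from the diff; `call_from_all` and the `call` closure are hypothetical names. Unlike `try_join_all`, it waits for every worker before reporting the first error.

```rust
use std::thread;

/// Runs `call(account, method)` for every account on its own thread and
/// returns the first error (in account order), annotated with node context.
fn call_from_all<F>(accounts: &[&str], method: &str, call: F) -> Result<(), String>
where
    F: Fn(&str, &str) -> Result<(), String> + Sync,
{
    thread::scope(|s| {
        let handles: Vec<_> = accounts
            .iter()
            .copied()
            .enumerate()
            .map(|(i, account)| {
                let call = &call;
                s.spawn(move || {
                    call(account, method).map_err(|e| {
                        format!("node {i} ({account}) failed to call {method}: {e}")
                    })
                })
            })
            .collect();
        // Join every worker; propagate the first failure encountered.
        handles
            .into_iter()
            .try_for_each(|h| h.join().map_err(|_| "worker panicked".to_string())?)
    })
}
```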

netrome previously approved these changes Mar 25, 2026
gilcu3 previously approved these changes Mar 26, 2026
Contributor

@gilcu3 gilcu3 left a comment

LGTM, although without having real tests I would not be surprised if something fails later 🙃

Left some comments; the main one is about creating signing keys manually. We should avoid writing this type of code unless there is no alternative and, if it is needed, do it in a centralized place.

@anodar anodar requested review from gilcu3 and netrome March 26, 2026 10:34
gilcu3 previously approved these changes Mar 26, 2026
Contributor

@gilcu3 gilcu3 left a comment

Thank you!

use crate::port_allocator::E2ePortAllocator;

-const DEFAULT_SANDBOX_IMAGE: &str = "nearprotocol/sandbox:2.10.7";
+const DEFAULT_SANDBOX_IMAGE: &str = "nearprotocol/sandbox:2.11.0-rc.3";
Contributor

Hopefully we never need to pin commits instead of tags here, as has happened in the past.

Collaborator Author

If this becomes a concern, we can support running the sandbox without Docker too. The common functionality should (hopefully) remain the same; only the process management would differ.

@anodar anodar enabled auto-merge March 26, 2026 20:42
@anodar anodar added this pull request to the merge queue Mar 27, 2026
Merged via the queue into main with commit e380db8 Mar 27, 2026
24 checks passed
@anodar anodar deleted the 2441-implement-mpc-cluster-orchestration-for-rust-e2e-tests-infra branch March 27, 2026 08:45

Development

Successfully merging this pull request may close these issues.

Implement MPC Cluster Orchestration for Rust E2E tests infra

3 participants